AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.61)

Apolinario, Marco Paul E., Roy, Kaushik

LANCE: Low Rank Activation Compression for Efficient On-Device Continual Learning

arXiv.org Artificial IntelligenceSep-29-2025

On-device learning is essential for personalization, privacy, and long-term adaptation in resource-constrained environments. Achieving this requires efficient learning, both fine-tuning existing models and continually acquiring new tasks without catastrophic forgetting. Yet both settings are constrained by high memory cost of storing activations during backpropagation. Existing activation compression methods reduce this cost but relying on repeated low-rank decompositions, introducing computational overhead. Also, such methods have not been explored for continual learning. We propose LANCE (Low-rank Activation Compression), a framework that performs one-shot higher-order Singular Value Decompsoition (SVD) to obtain a reusable low-rank subspace for activation projection. This eliminates repeated decompositions, reducing both memory and computation. Moreover, fixed low-rank subspaces further enable on-device continual learning by allocating tasks to orthogonal subspaces without storing large task-specific matrices. Experiments show that LANCE reduces activation storage up to 250$\times$ while maintaining accuracy comparable to full backpropagation on CIFAR-10/100, Oxford-IIIT Pets, Flowers102, and CUB-200 datasets. On continual learning benchmarks (Split CIFAR-100, Split MiniImageNet, 5-Datasets), it achieves performance competitive with orthogonal gradient projection methods at a fraction of the memory cost. These results position LANCE as a practical and scalable solution for efficient fine-tuning and continual learning on edge devices.

artificial intelligence, lance, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2509.21617

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsAug-15-2025, 18:41:04 GMT

Low-Rank Subspaces in GANs Supplementary Material

Fourth, Sec. 5 gives more results on controlling diverse regions or on other

artificial intelligence, machine learning, reference image, (17 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.99)
Information Technology > Artificial Intelligence > Vision (0.69)

Neural Information Processing SystemsAug-15-2025, 18:41:01 GMT

8b4066554730ddfaa0266346bdc1b202-Paper.pdf

artificial intelligence, machine learning, subspace, (18 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

arXiv.org Machine LearningApr-9-2025

Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning

Nayak, Nikhil Shivakumar, Killamsetty, Krishnateja, Han, Ligong, Bhandwaldar, Abhishek, Chanda, Prateek, Xu, Kai, Wang, Hao, Pareja, Aldo, Silkin, Oleg, Eyceoz, Mustafa, Srivastava, Akash

Continual learning in large language models (LLMs) is prone to catastrophic forgetting, where adapting to new tasks significantly degrades performance on previously learned ones. Existing methods typically rely on low-rank, parameter-efficient updates that limit the model's expressivity and introduce additional parameters per task, leading to scalability issues. To address these limitations, we propose a novel continual full fine-tuning approach leveraging adaptive singular value decomposition (SVD). Our method dynamically identifies task-specific low-rank parameter subspaces and constrains updates to be orthogonal to critical directions associated with prior tasks, thus effectively minimizing interference without additional parameter overhead or storing previous task gradients. We evaluate our approach extensively on standard continual learning benchmarks using both encoder-decoder (T5-Large) and decoder-only (LLaMA-2 7B) models, spanning diverse tasks including classification, generation, and reasoning. Empirically, our method achieves state-of-the-art results, up to 7% higher average accuracy than recent baselines like O-LoRA, and notably maintains the model's general linguistic capabilities, instruction-following accuracy, and safety throughout the continual learning process by reducing forgetting to near-negligible levels. Our adaptive SVD framework effectively balances model plasticity and knowledge retention, providing a practical, theoretically grounded, and computationally scalable solution for continual learning scenarios in large language models.

large language model, machine learning, natural language, (18 more...)

2504.07097

Country:

North America > United States (0.14)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Neural Information Processing SystemsJan-15-2025, 05:28:09 GMT

Low-Rank Subspaces in GANs

low-rank factorization, low-rank subspace, subspace, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.64)

arXiv.org Artificial IntelligenceJan-11-2024

Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations

Xie, Zhihui, Zhao, Handong, Yu, Tong, Li, Shuai

Large pretrained multilingual language models (ML-LMs) have shown remarkable capabilities of zero-shot cross-lingual transfer, without direct cross-lingual supervision. While these results are promising, follow-up works found that, within the multilingual embedding spaces, there exists strong language identity information which hinders the expression of linguistic factors shared across languages. For semantic tasks like cross-lingual sentence retrieval, it is desired to remove such language identity signals to fully leverage semantic information. In this work, we provide a novel view of projecting away language-specific factors from a multilingual embedding space. Specifically, we discover that there exists a low-rank subspace that primarily encodes information irrelevant to semantics (e.g., syntactic information). To identify this subspace, we present a simple but effective unsupervised method based on singular value decomposition with multiple monolingual corpora as input. Once the subspace is found, we can directly project the original embeddings into the null space to boost language agnosticism without finetuning. We systematically evaluate our method on various tasks including the challenging language-agnostic QA retrieval task. Empirical results show that applying our method consistently leads to improvements over commonly used ML-LMs.

computational linguistic, information, lir, (14 more...)

arXiv.org Artificial Intelligence

2401.05792

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(10 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Machine LearningOct-31-2018

Low-Rank Embedding of Kernels in Convolutional Neural Networks under Random Shuffling

Li, Chao, Sun, Zhun, Yu, Jinshi, Hou, Ming, Zhao, Qibin

Although the convolutional neural networks (CNNs) have become popular for various image processing and computer vision task recently, it remains a challenging problem to reduce the storage cost of the parameters for resource-limited platforms. In the previous studies, tensor decomposition (TD) has achieved promising compression performance by embedding the kernel of a convolutional layer into a low-rank subspace. However the employment of TD is naively on the kernel or its specified variants. Unlike the conventional approaches, this paper shows that the kernel can be embedded into more general or even random low-rank subspaces. We demonstrate this by compressing the convolutional layers via randomly-shuffled tensor decomposition (RsTD) for a standard classification task using CIFAR-10. In addition, we analyze how the spatial similarity of the training data influences the low-rank structure of the kernels. The experimental results show that the CNN can be significantly compressed even if the kernels are randomly shuffled. Furthermore, the RsTD-based method yields more stable classification accuracy than the conventional TD-based methods in a large range of compression ratios.

artificial intelligence, deep learning, machine learning, (15 more...)

1810.13098

Country:

Asia > Japan (0.14)
Asia > China (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Marchetti-Bowick, Micol, Lengerich, Benjamin J., Parikh, Ankur P., Xing, Eric P.

Hybrid Subspace Learning for High-Dimensional Data

arXiv.org Machine LearningAug-5-2018

The high-dimensional data setting, in which p >> n, is a challenging statistical paradigm that appears in many real-world problems. In this setting, learning a compact, low-dimensional representation of the data can substantially help distinguish signal from noise. One way to achieve this goal is to perform subspace learning to estimate a small set of latent features that capture the majority of the variance in the original data. Most existing subspace learning models, such as PCA, assume that the data can be fully represented by its embedding in one or more latent subspaces. However, in this work, we argue that this assumption is not suitable for many high-dimensional datasets; often only some variables can easily be projected to a low-dimensional space. We propose a hybrid dimensionality reduction technique in which some features are mapped to a low-dimensional subspace while others remain in the original space. Our model leads to more accurate estimation of the latent space and lower reconstruction error. We present a simple optimization procedure for the resulting biconvex problem and show synthetic data results that demonstrate the advantages of our approach over existing methods. Finally, we demonstrate the effectiveness of this method for extracting meaningful features from both gene expression and video background subtraction datasets.

bioinformatics, data mining, machine learning, (19 more...)

1808.01687

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

arXiv.org Machine LearningJun-20-2013

Iterative Grassmannian Optimization for Robust Image Alignment

He, Jun, Zhang, Dejiao, Balzano, Laura, Tao, Tao

Robust high-dimensional data processing has witnessed an exciting development in recent years, as theoretical results have shown that it is possible using convex programming to optimize data fit to a low-rank component plus a sparse outlier component. This problem is also known as Robust PCA, and it has found application in many areas of computer vision. In image and video processing and face recognition, the opportunity to process massive image databases is emerging as people upload photo and video data online in unprecedented volumes. However, data quality and consistency is not controlled in any way, and the massiveness of the data poses a serious computational challenge. In this paper we present t-GRASTA, or "Transformed GRASTA (Grassmannian Robust Adaptive Subspace Tracking Algorithm)". t-GRASTA iteratively performs incremental gradient descent constrained to the Grassmann manifold of subspaces in order to simultaneously estimate a decomposition of a collection of images into a low-rank subspace, a sparse part of occlusions and foreground objects, and a transformation such as rotation or translation of the image. We show that t-GRASTA is 4 $\times$ faster than state-of-the-art algorithms, has half the memory requirement, and can achieve alignment for face images as well as jittered camera surveillance images.

artificial intelligence, machine learning, optimization problem, (19 more...)

doi: 10.1016/j.imavis.2014.02.015

1306.0404

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Industry:

Commercial Services & Supplies > Security & Alarm Services (0.46)
Media > Photography (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.86)